Optimal Subset Selection for Active Learning

نویسندگان

  • Yifan Fu
  • Xingquan Zhu
چکیده

Active learning traditionally relies on instance based utility measures to rank and select instances for labeling, which may result in labeling redundancy. To address this issue, we explore instance utility from two dimensions: individual uncertainty and instance disparity, using a correlation matrix. The active learning is transformed to a semi-definite programming problem to select an optimal subset with maximum utility value. Experiments demonstrate the algorithm performance in comparison with baseline approaches.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Targeting Optimal Active Learning via Example Quality

In many classification problems unlabelled data is abundant and a subset can be chosen for labelling. This defines the context of active learning (AL), where methods systematically select that subset, to improve a classifier by retraining. Given a classification problem, and a classifier trained on a small number of labelled examples, consider the selection of a single further example. This exa...

متن کامل

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...

متن کامل

A Parallel Genetic Algorithm Based Method for Feature Subset Selection in Intrusion Detection Systems

Intrusion detection systems are designed to provide security in computer networks, so that if the attacker crosses other security devices, they can detect and prevent the attack process. One of the most essential challenges in designing these systems is the so called curse of dimensionality. Therefore, in order to obtain satisfactory performance in these systems we have to take advantage of app...

متن کامل

A Parallel Genetic Algorithm Based Method for Feature Subset Selection in Intrusion Detection Systems

Intrusion detection systems are designed to provide security in computer networks, so that if the attacker crosses other security devices, they can detect and prevent the attack process. One of the most essential challenges in designing these systems is the so called curse of dimensionality. Therefore, in order to obtain satisfactory performance in these systems we have to take advantage of app...

متن کامل

Binary coordinate ascent: An efficient optimization technique for feature subset selection for machine learning

Feature subset selection (FSS) has been an active area of research in machine learning. A number of techniques have been developed for selecting an optimal or sub-optimal subset of features, because it is a major factor to determine the performance of a machine-learning technique. In this paper, we propose and develop a novel optimization technique, namely, a binary coordinate ascent (BCA) algo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011